And
for that matter, what the dickens is replication anyhow?
Starting with that question, replication is a technique for ensuring that two (or
more) databases both have the same data, and potentially, other objects as well. The
need arises when people who aren’t closely connected need to have access to the same
data and tools. Examples include people
working on stand-alone PCs who need the same data, users who travel with lap-tops,
people in different parts of the world needing the same data, and situations where
the real data cannot be risked being exposed and tinkered with. In
essence, these cases involve sharing data just as with a multi-user database, but
the users aren’t always physically connected. Access and the Office Developer package
provide the tools to deal with all these situations and a few others. We
aren’t going to cover all of the gory details here, as the subject is quite complex,
but we hope to give you a general idea of how things work, and some references to
chase down details when you need to.
So how does replication work?
When a database is replicated, a special version called the design
master is created, and a set of special tables are created to track changes. In
addition, three special fields that are normally not visible are created in each table
to track changes in data. These changes
occur when a replica is created using the Tools/Replication/Create Replica command
for the first time from a database. Any
number of replicas can be created from the design master, or from one of the other
replicas. At the time a replica is created,
it is essentially an exact duplicate of the database it was created from, but as soon
as data is changed in one or the other of the databases, they are no longer the same.
The process of synchronization is used to bring two databases back into agreement. This
process can be initiated from the design master or from a replica. The process takes
all of the changes from each database and updates records as appropriate. Note
that changes in data go both ways, but that design changes to the schema, or to queries,
forms, reports and code must be done at the design master, and then replicated out
to each of the members of the replica set. It
is possible to create what are local objects that only exist in one member of the
replica set, but they are never synchronized. This
may actually be an advantage in some situations.
This sounds great – what’s the downside?
For one thing, replicating a database makes it grow considerably. In
some cases it may grow to two or three times the size of the un-replicated version. You
may also see some degradation in performance when a database is replicated. You
can’t move databases in the replica set to different locations without careful planning
and execution. Different users may see
different data because the databases have not been recently synchronized, and more
importantly, changes to the same set of data may be inconsistent. This
last issue identifies the need for what is called Conflict Resolution, and that process
is the third major component of replication.
Access provides tools to deal with the more common kinds of replication conflicts,
and there are some significant differences in the way various versions handle conflicts. In
particular, Access 95/97 identified conflicts based on two or more changes to a record,
while Access 2000/2002 track changes at the field level, and do not generate a conflict
unless the same field has been changed by two or more users. Note,
however, that the conflict resolution tools make some assumptions about who wins in
various scenarios, and may not always make the right choice in a given situation. For
that reason it is possible to build a custom conflict resolution scheme that more
closely matches the needs of a given user.
Examples of conflict situations that can arise include:
-
One user deletes a record, while another user updates information in that same record
-
Two users update the same record with different information
-
Two users add the same information in different records
-
Two users add different records with the same primary key
Each of these cases requires some sort of resolution strategy, so it is important
to understand the default resolution strategy. For
example, in Access 2000, each replica is given a replica set priority number, and
when all else fails, it resorts to letting the highest priority replica win. We
strongly recommend that before you implement replication you study the default conflict
resolution strategy and decide whether it will be acceptable in your situation. Another
issue to be aware of, is that auto number fields become random rather than sequential,
and may be very large either positive or negative numbers. This
was done to minimize the probability of adding two records with the same primary key
as noted above.
So where can you use replication successfully?
In general, if you have good connectivity across a LAN, you should avoid replication. There
may be cases where you have overloaded or unreliable LANs that would make replication
attractive, but our view is that a better use of resources would be to fix the network. Having
identified where it isn’t effective, these are some cases you might seriously consider
using it:
Creating a warm backup of data – synchronizing
to a replica every few minutes could give you a backup to revert to in case of a hard
drive crash or some other disaster. Be
aware that there are other ways, some automated, for keeping good backups of Access
databases.
Distribution of design changes – it can
be a challenge to deploy design changes to a database that is widely deployed, and
replication can handle that chore reasonably well. In
fact if the database is deployed across a WAN or even in a sneaker or dial-in net,
that may be the best choice. Again there
are alternatives to be considered, such as a split front-end back-end arrangement
with the front-end copied to the workstation (see the tutorial on that subject - Why
Split a Database?), or the use of facilities such as PCAnyWhere,
Citrix and Windows Terminal Server.
Occasionally connected users – replication
seems to be about the only solution that works well in this situation. Most
often it involves a person traveling with a laptop needing to see data and make a
few changes or additions, then returning to the main office and synchronizing with
the master database. Several of our clients
have used this solution with reasonable success.
Widely distributed networks – replication
may or may not work well in this situation depending on the frequency of updates and
who is making them. It has worked successfully
in situations where several locations needed to send data to a central location for
analysis, as well as situations where a central location needs to send out data to
many distant locations. On the other
hand, in situations where data was being changed or added in many different locations,
a complex synchronization process beyond the scope of this tutorial is required, and
may need a good deal of maintenance. If
you are considering replication for this scenario, you should plan on using the Replication
Manager that comes with the Developer version of Office, and you should study its
capabilities in some detail before hand. Again,
with the advent of high speed connectivity via the Internet, you should probably also
consider one of the remote access technologies noted above.
Where do I get more information?
There are a number of resources available – one of the most comprehensive coverages
of replication is “The Access Developer’s Handbook” by Litwin, Getz, et.al. published
by Sybex. It is available in versions
for Access 97, 2000 and 2002, and should be on the bookshelf of any serious database
developer. In addition, there are a number
of Microsoft Knowledge Base articles for you to peruse. For
Access 2000, article 190766 gives
pointers to two white papers that cover replication and synchronization in considerable
detail. Article 208774 describes
the process of implementing replication in some detail. Article 282977 points
to an extensive list of FAQs related to replication. There
are several similar articles for Access 97 and 2002, and there are numerous articles
that deal with specific replication problem situations that can be located by going
to the Microsoft Support web site and searching for replication with the version of
Access you are using.
Finally, we should note that this tutorial has dealt only with replication for Jet
based (i.e. .mdb files) databases. Replication
of SQL Server of Desktop Engine databases works on somewhat different principles which
are described in articles in the knowledge base.